E-book readers (left to right): PRS-500 (Sony, 2006), PRS-505 (Sony, 2007), Kindle 1 (Amazon, 2007).
Below: Kindle 2 (Amazon, 2009), Kindle DX (Amazon, 2009), iPad (Apple, 2010) 

Source: John Blyberg, Wikimedia Commons.







EXECUTIVE SUMMARY 



This report describes a conversion experiment and subsequent reader survey 
conducted by ACLS Humanities E-Book (HEB) in late 2009 and early 2010 to assess
the viability of using scholarly monographs with handheld e-readers. Scholarly 
content generally involves extensive networking and cross-referencing between 
individual works through various channels, including bibliographical citation and 
subsequent analysis and discussion. Through past experience with its online 
collection, HEB had already determined that a web-based platform lends itself 
well to presenting this type of material, but was interested in exploring which key 
elements would need to be replicated in the handheld edition in order to maintain the 
same level of functionality, as well as what specific factors from either print or digital 
publishing would have to be taken into account. As sample content, HEB selected six 
titles from its own online collection, three in a page-image format with existing OCR- 
derived text and three encoded as XML files, and had these converted by an outside 
vendor with minimal editorial intervention into both MOBI (prc) and ePub files. 

During its in-house assessment phase, HEB experienced some navigational 
difficulty with both formats and found that annotation and other interaction with the 
text were difficult using a number of popular e-readers. (Specifically, the sample
titles were tested by HEB on the Sony Reader PRS-700, Amazon’s Kindle 2 and 
the Stanza application on the Apple iPhone.) HEB also found the XML titles to be 
of limited functionality in the MOBI format and therefore opted not to further poll 
readers on this subset. 

About 88% of our 142 survey participants expressed overall satisfaction with the 
appearance and functionality of the three remaining handheld samples, although 
roughly half reported some level of frustration with the search function using either 
format, and only 26% felt they would have an easy time citing and referencing 
these editions. Satisfaction with other interactive features, such as adding notes, 
bookmarking and highlighting, was noticeably higher; however, the “n/a” option was 
also selected frequently for these categories, and it appears that a large number 
of participants were unable to perform the tasks in question due to confusing or 
insufficient instructions from the device manufacturer. As formats evolve, future 
satisfaction with these features may increase. Irrespective of specific limitations, 
75% of participants were interested in potentially downloading additional similar 
titles for free or if priced below $10.

HEB’s production costs, starting from preexisting OCR-derived text and XML files, 
amounted to about $204 per title for creating both editions, ePub and MOBI. As 
an example for other publishers, were we to process 300 additional titles from our
online collection, the cost would rise to about $232 per title (for a bulk conversion of page-image
titles only, which are somewhat more expensive to convert than XML). Therefore, if
titles were sold at $10, production costs would be offset at twenty-four downloads per title.
This data is included to provide publishers with a basic idea of conversion costs
from one digital format to another; however, it does not take into account other
ordinary overhead charges or management fees and discounts for third-party 
retailers and distributors, which would need to be factored in separately. 
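
The break-even figure quoted above follows from simple arithmetic. The short Python sketch below reproduces it using only the per-title costs and the $10 price given in this summary; the function name is illustrative, and overhead, fees and discounts are deliberately left out, as the text notes.

```python
import math


def break_even_downloads(cost_per_title: float, price: float) -> int:
    """Smallest whole number of paid downloads whose revenue covers the cost."""
    if price <= 0:
        raise ValueError("price must be positive")
    return math.ceil(cost_per_title / price)


# Figures quoted in this summary (conversion cost only, no overhead).
bulk_cost = 232   # dollars per title, bulk conversion of page-image titles
pilot_cost = 204  # dollars per title, six-title pilot (ePub and MOBI together)
price = 10        # dollars per download

print(break_even_downloads(bulk_cost, price))   # 24
print(break_even_downloads(pilot_cost, price))  # 21
```

Rounding up matters here: 23.2 downloads is not achievable, so the cost is recovered only at the twenty-fourth sale.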

HEB’s initial findings in this study indicate that titles formatted for existing handheld
devices are not yet adequate for scholarly use: they replicate neither the
benefits of online collections (cross-searchability, archiving, multifarious interactive
components) nor certain aspects of print editions that users reported missing, such
as being able to mark up and rapidly skim text. A turnaround can be expected once a
common and more robust format optimized for handheld readers is determined and
devices themselves evolve, adding improved display options and better and more
intuitive web access, searching and other interactive use of content.



INTRODUCTION: ONLINE VERSUS HANDHELD 



E-book readers and e-reader applications for smart phones and PDAs have been 
steadily gaining in popularity over the last few years, as the pervasive coverage 
in both technology-oriented publications and in the mainstream press confirms, 
and it’s becoming increasingly difficult to keep up with the release of new devices 
and improvements on existing platforms. At a somewhat slower pace than in 
commercial markets, handheld reading is also gaining a foothold in academia. 
During the last few years, ACLS Humanities E-Book (HEB) has received an 
increasing number of inquiries from its subscribers and other interested parties 
wondering whether we were planning to offer titles from our collection for download 
in formats optimized for this new wave of handheld e-book readers. 

Currently, HEB subscribers are able to access our nearly 2,800 digital titles 1 — 
spanning dozens of disciplines, as well as multiple discrete series — online via 
standard web browsers, in full-text editions and fully cross-searchable. Titles may 
be viewed in multiple iterations, including the default scanned page-image view 
at various magnifications, a PDF (portable document format) view that allows for 
printing of three consecutive pages and an unformatted OCR (optical character 
recognition)-derived text view. The books are hosted at the University of Michigan 
Library, whose Scholarly Publishing Office disseminates and provides maintenance 
for the collection, with limited options for downloading and printing and no capability 
for transferring files off the library’s servers to personal computers or portable 
readers. Keeping in mind rights restrictions and our subscription-based access 
model, there seemed to be no immediate practical route to switching to downloadable 
books, and we conveyed as much to our subscribers whenever queried. 

However, upon attempting to delve further into the subject of downloadable monographs, 
HEB discovered that little had been rigorously studied or published to date regarding 
the suitability of handheld e-reader devices for disseminating content intended 

1. For more information on title selection and to download a spreadsheet listing all current titles, please
visit: http://www.humanitiesebook.org/titlelist.html. 
specifically for scholarly research. 2 In contrast to trade publishing, where individual titles
(or series) are often more or less self-contained, scholarly content generally involves 
extensive networking enabling individual works to “speak” to one another, be it through 
bibliographical citation and reference or through subsequent analysis and discussion. It 
therefore makes sense that online aggregation lends itself well to presenting this type of 
material in digital form, as we already knew from HEB’s seven years of online publishing 
experience since launching the collection in 2002. But which key features of this 
successful model would need to be replicated in the handheld environment in order to 
produce useful results for the scholarly community, and what specific factors from either 
print or digital publishing would have to be taken into account? 

These questions, in conjunction with HEB’s commitment to periodically reevaluating 
the utility and longevity of its collection by exploring different e-book formats, prompted 
us in fall 2009 to select a small sample of titles for conversion in order to conduct a 
limited, controlled study to assess this content on then-current handheld devices. 
Since the HEB collection is widely known and subscribed to and includes high-quality 
titles recommended and reviewed by ACLS’s constituent learned societies, it offered a 
consistent and easily analyzed body of works that would allow for efficient comparison 
of publication platforms, reader expectations and requirements between the online 
and handheld environments. We therefore considered our ability to make a small 
but significant contribution to this emerging area to be well worth the time and effort 
expended in this study, which was conceived as a two-part process: an in-house 
evaluation followed by an external-reader survey. 3 



CONVERTING BOOKS FOR HANDHELD DEVICES 



TITLE AND FORMAT SELECTION 

HEB opted to convert three page-image titles and three XML titles from our online 
collection for use in this experiment. The vast majority of HEB’s online titles belong in 
the former category; that is, they are presented online as scanned page images of



2. Several campus-based studies of textbooks for use with handheld readers have been published 
— several examples of these are presented in the conclusion below — but we were less interested in 
the largely subjective reactions to the handheld reading experience gathered in these than in a broader 
assessment of which elements of digital scholarly communication could be efficiently and cost-effectively 
presented using then-available devices and software. The Chronicle of Higher Education has since 
conducted its own survey of handheld readers that covers some of the same ground — see note 19 below. 

3. For further reading on the growing importance of electronic resources in general over print books
to libraries, see the following recent studies: CIBER, The Economic Downturn and Libraries (University
College London, December 2009), available online at
http://www.ucl.ac.uk/infostudies/research/ciber/charleston-survey.pdf;
Michael Newman, The 2009 Librarian eBook Survey (HighWire Press, 2010),
http://highwire.stanford.edu/PR/HighWireEBookSurvey2010.pdf;
and Roger C. Schonfeld and Ross Housewright, Faculty Survey 2009: Key Strategic Insights for Libraries,
Publishers, and Societies (ITHAKA S+R, 2010),
http://www.ithaka.org/ithaka-s-r/research/faculty-surveys-2000-2009/Faculty%20Study%202009.pdf.
Note that these reports do not necessarily differentiate among different e-book formats.
the print edition, with underlying OCR-derived text for searching. (These associated,
minimally formatted text files are also provided to readers as a separate viewing option, 
to enable display of highlighted search terms as well as for copying and pasting.) At the 
time, there were also seventy-one text-encoded XML (extensible markup language) 
titles in the collection, which are dynamically transformed into HTML for online viewing. 
These titles can include links and other interactive features whose translation to 
handheld devices we were interested in assessing during the course of our experiment. 
From among the page-image titles, we chose relatively popular books unencumbered 
by rights issues; from among the XML books, we chose titles with some interactive 
components that would not overwhelm the basic textual content, since we were unsure 
how functional the former would ultimately be in the handheld edition. 4

While pondering which target format was best suited to this experiment, we attempted 
to take into account which types of files were most versatile and universally accessible 
on then-current e-book readers. PDF is supported by nearly every popular device 
on the market and thus fit the bill of accessibility — not to mention that, for many 
publishers, this would probably represent the simplest conversion solution; 5 yet this
option seemed limited in terms of interactive content and formatting due to its lack of 
reflowable text and was therefore not exactly in keeping with the nature of our inquiry. 
EPub is an open standard developed by the International Digital Publishing Forum 
(IDPF), frequently cited as the most flexible and one of the most extensively supported 
digital formats currently in use. It is predicted to be adopted even more widely in the 
future, and therefore seemed like an optimal choice. We also took into consideration 
the status of Amazon’s Kindle as the most prominent dedicated handheld reader in 
use at the time, followed by the Sony Reader as a distant second. 6 EPub is compatible
with the Sony device but not with the Kindle; however, in addition to its proprietary AZW
format, the Kindle can also display unprotected Mobipocket files, 7 which are closely



4. The six titles chosen were Norman Daniel’s The Arabs and Mediaeval Europe (London: Longman,
1975), also part of HEB’s print-on-demand program; Lewis Hanke’s The Spanish Struggle for Justice in
the Conquest of America (Philadelphia: University of Pennsylvania Press, 1949); Karl Polanyi’s The Great
Transformation (Boston: Beacon Press, 1957, c1944); Therese-Adele Husson, Reflections (New York:
New York University Press, 2003, c2001), which included internal cross-linking at the paragraph level
between the historical French text and English translation; Fred Nadis, Wonder Shows (New Brunswick,
NJ: Rutgers University Press, 2006, c2005), which included video files; and Barbara Newman, Voice of the
Living Light (Berkeley: University of California Press, 2008, c1998), to test formatting of diverse encoded
text elements.

5. For most print publishers, it would likely be easy to obtain PDF output during the design stage, in 
which case minimal further action would need to be taken to prepare the title for digital conversion. HEB’s 
situation is complicated by the fact that, while our page-image titles can all be viewed online in PDF form, 
these are image files only rather than web-optimized files with accessible text. 

6. According to a webinar presented by data-conversion service provider Aptara on November 18, 2009,
“EBook Readers & Standards... Where to Next?”, as of October 2009, sales of Kindle and Sony Reader
devices represented 60% and 35% of year-to-date sales of dedicated readers in the U.S., respectively, with
only 5% of other devices being purchased. See PDF slideshow summary, available online at:
http://event.on24.com/event/17/21/63/rt/1/documents/slidepdf/aptara_ereader_webcast.pdf.

7. Also known as MOBI, with file extensions .mobi or .prc. 






Handheld E-Book Readers and Scholarship 



related to the former. After soliciting additional input on formats from our vendor, we 
decided to test out both ePub and MOBI. In addition to trying out our sample books on 
the two devices mentioned above, this would also allow us to test them on the Apple 
iPhone (and eventually on the iPad, not yet released when our survey was launched), 
with applications available for viewing both types of formats. 
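
The reasoning behind testing two target formats can be sketched as a small coverage check. The Python example below encodes only the device support stated in this section; treating anything the text does not mention (for instance, MOBI on the Sony Reader) as unsupported is an assumption made for illustration, and the names `SUPPORT` and `smallest_format_sets` are hypothetical.

```python
from itertools import combinations

# Support as stated in the text; unstated combinations are ASSUMED unsupported.
SUPPORT = {
    "Kindle": {"AZW", "MOBI"},
    "Sony Reader": {"ePub"},
    "iPhone (reader apps)": {"ePub", "MOBI"},
}


def smallest_format_sets(support, candidates):
    """Return the smallest format combinations that let every device open a file."""
    for size in range(1, len(candidates) + 1):
        hits = [
            set(combo)
            for combo in combinations(candidates, size)
            if all(formats & set(combo) for formats in support.values())
        ]
        if hits:
            return hits
    return []


# PDF is left out because it was set aside for lacking reflowable text,
# and AZW because it is proprietary to the Kindle.
print(smallest_format_sets(SUPPORT, ["ePub", "MOBI"]))
```

Under these assumptions neither format alone reaches all three devices, which is consistent with the decision to convert to both.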



CONVERSION PROCESS 

HEB established its XML-title specifications (see
http://www.humanitiesebook.org/xml/doc/acls-hebook-doc.html) over several years of praxis and has always
reviewed and occasionally corrected or augmented files for text-encoded titles in its 
online collection. For this set of handheld editions and at this stage of the learning 
process, it soon became clear that we would not be able to closely examine 
conversion results on a technical backend level and instead would mostly be 
reviewing output, ceding some control over the conversion process and relying in 
large part on our vendor. However, this suited HEB’s interest in keeping editorial 
intervention on the part of our in-house staff to a minimum in order to explore the 
possibility of performing large-scale conversions of additional titles, as we had 
attempted to do with a previous project involving the retroactive conversion of 
backlist page-image titles to XML. 8 

In order to provide our vendor with source files for the conversion of the three page- 
image titles, HEB transmitted the previously generated OCR-derived text files already 
in use online. The OCR process is imperfect, and therefore such files typically have 
an error margin of 0.01%. As a corrective option, HEB asked the vendor to perform 
an automated spell-check on the affected titles, though we were told this would not 
eliminate all possible types of errors. (We knew from past experience with the same 
XML backlist conversion project referenced above that performing individual proofing 
on these books would be prohibitively costly.) For the three XML titles, we submitted 
the XML files tagged in accordance with HEB’s in-house specs. 
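
To give a sense of scale for the error margin mentioned above, the following Python sketch estimates how many residual OCR errors a single title might contain. It assumes the quoted rate applies per character and uses a hypothetical book length of 600,000 characters; neither assumption comes from the report.

```python
def expected_ocr_errors(char_count: int, error_rate_percent: float) -> float:
    """Expected number of wrong characters, assuming a uniform per-character rate."""
    if char_count < 0 or error_rate_percent < 0:
        raise ValueError("inputs must be non-negative")
    return char_count * error_rate_percent / 100.0


# ASSUMPTION: a monograph of roughly 600,000 characters (hypothetical figure).
book_chars = 600_000
rate_percent = 0.01  # error margin quoted in the text, read as per character

print(round(expected_ocr_errors(book_chars, rate_percent)))
```

Even a small rate leaves dozens of errors per book under these assumptions, and an automated spell-check would miss any that happen to form valid words.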

Since images already existed as separate related files for our XML titles, we 
submitted these as they were, to be adjusted as needed by the vendor for inclusion 
as figures in the ePub and MOBI editions. For the page-image titles, we provided 
the vendor with a complete list of illustrations so that these could be located in 
the online edition, cropped out of the page scan and subsequently processed. 9 



8. See ACLS Humanities E-Book XML Conversion Experiment: Report on Workflow, Costs, and 
User Preferences (New York: The American Council of Learned Societies, 2009), p. 7, “Description of 
Experiment.” (Available online at http://www.humanitiesebook.org/heb-whitepaper-2.html .) 

9. HEB maintains a title database that, among other functions, tracks the location, by page number, of 
figures in its page-image books for initial scanning purposes. We were therefore able to quickly access and 
export this data. HEB’s needs for applying special scanning techniques for illustrations vary, however, and 
thus some types of images — for example, line art — are less likely to have been originally tracked in this 
manner. In order to be absolutely sure all illustrations are accounted for we would need to double-check 
each book again by hand and preferably in the future list the total number of illustrations to be included in 
the handheld edition in a separate database field. 